Lecture 7: Nearest Neighbor Search, Locally Sensitive Hashing
Authors
Abstract
If d = 1, we can simply sort the points and do a binary search; so we can answer the query in time O(log n). We can extend this idea to d dimensions; this leads to famous data structures known as k-d trees and quad trees. Unfortunately, the memory that we need for any of these data structures grows exponentially in d, i.e., n. So, even if d is about 30 it is impossible to run these algorithms on today’s computers.
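The one-dimensional case described above can be sketched in a few lines. This is only an illustration under stated assumptions: the points are plain numbers, and the function names `build_index` and `nearest_1d` are hypothetical, not taken from the lecture. Sorting costs O(n log n) once; each query is then a binary search in O(log n).

```python
from bisect import bisect_left


def build_index(points):
    """Preprocess: sort the 1-D points once (O(n log n))."""
    return sorted(points)


def nearest_1d(sorted_points, q):
    """Return the point closest to q using binary search (O(log n))."""
    if not sorted_points:
        raise ValueError("index is empty")
    # i is the position of the first point >= q.
    i = bisect_left(sorted_points, q)
    # The nearest neighbor is either that point or the one just before it;
    # the slice also handles the cases where q is outside the range.
    candidates = sorted_points[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda p: abs(p - q))


index = build_index([9.0, 1.0, 4.0, 7.0])
print(nearest_1d(index, 5.0))
```

Only the two points that straddle the query need to be compared, which is what makes sorting sufficient when d = 1. No such total order exists for d > 1, which is why tree-based structures are needed there.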
Similar resources
Developing a Good Hash Function for LSH
In the previous lecture we saw how to design locality sensitive hashing (LSH) for Hamming and l1 distance, as a solution to the (c,R)-Near Neighbor problem. Whenever we use LSH for nearest neighbor search under some distance measure, the main task is to come up with a good elementary hash function. This elementary hash function (H) is then used to create a composite hash function (G) fo...
lsh, Nearest neighbor search in high dimensions
Calculating distance pairs is O(n^2) in memory and time, and finding the nearest neighbor is O(n) in time. Tree indexing techniques like kd-tree [2] were developed to cope with large n; however, their performance quickly breaks down for p > 3 [3]. Locality sensitive hashing (LSH) [3] is a technique for generating hash numbers from high dimensional data, such that nearby points have identical hashe...
Locality-Sensitive Hashing for Data with Categorical and Numerical Attributes Using Dual Hashing
Locality-sensitive hashing techniques have been developed to efficiently handle nearest neighbor searches and similar pair identification problems for large volumes of high-dimensional data. This study proposes a locality-sensitive hashing method that can be applied to nearest neighbor search problems for data sets containing both numerical and categorical attributes. The proposed method makes ...
SK-LSH: An Efficient Index Structure for Approximate Nearest Neighbor Search
Approximate Nearest Neighbor (ANN) search in high dimensional space has become a fundamental paradigm in many applications. Recently, Locality Sensitive Hashing (LSH) and its variants have been acknowledged as the most promising solutions to ANN search. However, state-of-the-art LSH approaches suffer from a drawback: accesses to candidate objects require a large number of random I/O operations. In or...
Hashing Image Patches for Zooming
In this paper we present a Bayesian image zooming/super-resolution algorithm based on a patch based representation. We work on a patch based model with overlap and employ a Locally Linear Embedding (LLE) based approach as our data fidelity term in the Bayesian inference. The image prior imposes continuity constraints across the overlapping patches. We apply an error back-projection technique, w...